perm filename PATREC[4,KMC]11 blob
sn#104285 filedate 1974-05-28 generic text, type C, neo UTF8
PATTERN-MATCHING RULES FOR THE RECOGNITION OF
NATURAL LANGUAGE DIALOGUE EXPRESSIONS
Kenneth Mark Colby
and
Roger C. Parkison
INTRODUCTION
To recognize something is to identify it as an instance of
the "same again". This familiarity is possible because of recurrent
characteristics of the world which repeat themselves over and over
again. We shall describe an algorithm which recognizes recurrent
characteristics of natural language dialogue expressions. It utilizes
a multi-stage sequence of pattern-matching rules for progressively
transforming these input expressions into a pattern which eventually
best matches a more abstract stored pattern. The name of the stored
pattern has a pointer to the name of a response function in memory
which decides what to do once the input has been recognized. Here
we shall discuss only the recognizing functions, except for one
response function (anaphoric substitution) which interactively aids
the recognition process. Details of how the response functions
operate will be described in a future communication by William
Faught.
In constructing and testing a simulation of paranoid
processes, we were faced with the problem of reproducing paranoid
linguistic behavior in a teletyped diagnostic psychiatric interview.
The diagnosis of paranoid states, reactions or modes is made by
clinicians who judge a degree of correspondence between what they
observe linguistically in an interview and their conceptual model of
paranoid behavior. There exists a high degree of agreement among
psychiatrists about this conceptual model which relies mainly on what
an interviewee says and how he says it.
Natural language is a life-expressing code people use for
communication with themselves and others. In a real-life dialogue
such as a psychiatric interview, the participants have interests,
intentions, and expectations which are revealed in their linguistic
expressions. To produce effects on an interviewer which he would
judge similar to the effects produced by a paranoid patient, an
interactive simulation of a paranoid patient must be able to
demonstrate typical paranoid interview behavior. To achieve the
desired effects, the paranoid model must have the ability to deal
with the teletyped linguistic behavior of the interviewer.
There are a number of approaches one might consider for an
ideal handling of dialogue expressions. One approach would be
to construct a dictionary of all expressions which could possibly
arise in an interview. Associated with each expression would be its
interpretation, depending on dialogue context, which in turn must
somehow be defined. For obvious reasons, no one takes this approach
seriously. Instead of an expression dictionary, one might
construct a word dictionary and then use projection rules to yield an
interpretation of a sentence from the dictionary definitions. This
has been the approach adopted by most linguistic parsers based on
transformational or systemic grammars. Such a method performs
adequately as long as the dictionary involves only a few hundred
words, each word having only one or two senses, and the dialogue is
limited to a mini-world of a few objects and relations. But the
problems which arise in a real-time psychiatric interview conducted
in unrestricted English are too great for this method to be useful in
actual dialogues requiring immediate responses.
There is little consensus knowledge about how humans process
natural language. They can be shown to possess some knowledge of
grammar rules but this does not entail that they use a grammar in
interpreting and producing language. It seems implausible to us that
people possess full transformational grammars for processing
language. Language is what is recognized but the processes
involved may not be linguistic or grammatical. Originally
transformational grammars were not designed to "understand" a large
subset of English; they represented a formal method for deciding
whether a string is grammatical.
An analysis of what one's problem actually is should guide
the selection or invention of methods appropriate to that problem's
solution. Our problem was not to develop a consistent and general
theory of language nor to assert empirically testable hypotheses
about how people process language. Our problem was to design an
algorithm which recognizes what is being said in a dialogue and what
is being said about it in order to make a response such that a sample
of I-O pairs from the paranoid model is judged similar to a sample of
I-O pairs from paranoid patients. The design task was one of
artificial intelligence in which attempts are made to program
computers to perform human mind-like functions. New methods had to
be devised for an algorithm to participate in a human dialogue in a
paranoid-patient-like way. We are not claiming that our methods
represent the way people process language. We sought effective
methods which could operate efficiently in real time. Since our
methods provide a general way of many-to-one mapping from surface
expressions to a single stored pattern, these methods can be used by
any type of "host" system which takes natural language as input.
To perform the task of managing communicative uses and
effects of natural language, we developed a strategy of transforming
the input until a pattern is achieved which matches completely or
partially a more abstract stored pattern. This strategy has proved
adequate for our purposes a satisfactory percentage of the time. (No
one expects any method to be successful 100% of the time since not
even humans, the best natural language processors around, achieve
this level of performance.) The power of this method for natural
language dialogues lies in its ability to ignore as irrelevant some
of what it recognizes and everything it does not recognize at all. A
conventional parser doing word-by-word, parts-of-speech analysis
fails when it cannot find one or more of the input words in its
dictionary. Because it must know every word, it is too fragile for
unrestricted dialogues.
In early versions of the paranoid model (PARRY1), many (but
not all) of the pattern-recognition mechanisms allowed the elements
of the pattern to be order-independent (Colby, Weber, and Hilf,
1971). For example, consider the following expressions:
(1) WHERE DO YOU WORK?
(2) WHAT SORT OF WORK DO YOU DO?
(3) WHAT IS YOUR OCCUPATION?
(4) WHAT DO YOU DO FOR A LIVING?
(5) WHERE ARE YOU EMPLOYED?
In PARRY1 a procedure would scan these expressions looking for an
information-bearing contentive such as "work", "for a living", etc.
If it found such a contentive along with a "you" or "your" in the
expression, regardless of word order, it would respond to the
expression as if it were a question about the nature of one's work.
This method correctly classifies the 5 sentences above.
Unfortunately, it includes the 2 examples below in the same category:
(6) DOES YOUR FATHER'S CAR WORK?
(7) HOW DID THINGS WORK OUT FOR YOU?
An insensitivity to word order has the advantage that lexical items
representing different parts of speech can represent the same
concept, e.g. "work" as noun or as verb. But a price is paid for this
resilience and elasticity. We found from experience that, since
English relies heavily on word order to convey the meaning of its
messages, the average penalty of misunderstanding (to be
distinguished from ununderstanding) was too great. Hence in PARRY2,
as will be described shortly, the patterns require a specified word
order.
For high-complexity problems it is necessary to have
constraints. Diagnostic psychiatric interviews (and especially
those conducted over teletypes) have several natural constraints.
First, clinicians are trained to ask certain questions in certain
ways. This limits the number of patterns required to recognize
utterances about each topic. Second, only a few hundred standard
topics are brought up by interviewers who are trained to use everyday
expressions and especially those used by the patient himself. When
the interview is conducted over teletypes, expressions tend to be
shortened since the interviewer tries to increase the information
transmission rate over the slow channel of a teletype. Finally,
teletyped interviews represent written utterances. Utterances are
known to be highly redundant and unrecognized words can be ignored
without losing the meaning of the message. Utterances are loaded with
idioms, cliches, pat phrases, etc. - all being easy prey for a
pattern-matching approach. It is time-wasting and usually futile to
try to decode an idiom by analyzing the meanings of its individual
words.
We shall now describe the pattern-matching functions of the
algorithm in some detail. (See Fig. 1 for a diagram of the overall
flow of control).
OVERVIEW
PARRY2 has two primary modules. The first attempts to
RECOGNIZE the input and the second RESPONDS. This paper is primarily
about the RECOGNIZE module. It functions independently of the
RESPOND module except in the case of pronoun references, which the
RESPOND module provides to the RECOGNIZER on request.
The recognition module has 4 main steps:
1) Identify the words in the question and convert them to
internal synonyms.
2) Break the input into segments at certain bracketing words.
3) Match each segment (independently) to a stored pattern.
4) Match the resulting list of recognized segments to a stored
compound pattern.
Each of these steps, except the segmenting, throws away what it
cannot identify. When an utterance refers to a topic which PARRY2
has some ability to recognize, this increases the chance that it will
ultimately be recognized. Occasionally, a reference to an unknown
topic is mis-recognized as some familiar topic.
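The four steps above can be sketched in outline. The dictionary, delimiter set, and pattern tables below are toy stand-ins invented for illustration; they are not PARRY2's actual data or code:

```python
# Toy data standing in for PARRY2's dictionary, delimiter list,
# and pattern stores (all invented for illustration).
DICTIONARY = {
    "what": "WHAT", "is": "BE", "are": "BE", "your": "YOU",
    "you": "YOU", "occupation": "JOB", "job": "JOB",
    "why": "WHY", "in": "IN", "the": None, "hospital": "HOSPITAL",
}
DELIMITERS = {"IN", "WHY", "WHAT"}        # prepositions, wh-forms, etc.
SIMPLE_PATTERNS = {
    ("WHAT", "BE", "YOU", "JOB"): "job-question",
    ("WHY", "BE", "YOU"): "why-question",
    ("IN", "HOSPITAL"): "hospital-topic",
}
COMPOUND_PATTERNS = {
    ("why-question", "hospital-topic"): "why-in-hospital",
}

def recognize(utterance):
    # Step 1: look up each word and translate it to an internal
    # synonym; unknown words (absent or mapped to None) drop out.
    words = [DICTIONARY[w] for w in utterance.lower().strip("?. ").split()
             if DICTIONARY.get(w)]
    # Step 2: break the word list into segments at delimiter words.
    segments, current = [], []
    for w in words:
        if w in DELIMITERS and current:
            segments.append(tuple(current))
            current = []
        current.append(w)
    if current:
        segments.append(tuple(current))
    # Step 3: match each segment independently; failures drop out.
    names = tuple(n for n in (SIMPLE_PATTERNS.get(s) for s in segments) if n)
    # Step 4: match the list of segment names against compound patterns.
    if names in COMPOUND_PATTERNS:
        return COMPOUND_PATTERNS[names]
    return names[0] if len(names) == 1 else None
```

Note how "THE" (mapped to None) is thrown away at step 1, exactly as unrecognized words are in the model.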
PREPROCESSING
Each word in the input expression is first looked up in a
dictionary of (currently) about 1900 entries which, for the sake of
speed, is maintained in core during run-time. (The complete language
recognition process requires less than one second of real time on a
time-shared DEC PDP-10). The dictionary, which was built empirically
from thousands of teletyped interviews with previous versions of the
model, consists of words, groups of words, and names of word-classes
they can be translated into. (See Fig. 2 for illustrative examples
of dictionary entries). The size of the dictionary is determined by
the patterns, i.e. a word is not included unless it appears in one or
more patterns. Entries in the dictionary reflect PARRY2's main
interests. If a word in the input is not in the dictionary, it is
dropped from the pattern being formed. Thus if the input were:
WHAT IS YOUR CURRENT OCCUPATION?
and the word "current" were not in the dictionary, the pattern at
this stage becomes:
( WHAT IS YOUR OCCUPATION )
The question-mark is thrown away as redundant since questions are
recognized by word order. (A statement followed by a question mark
(YOU GAMBLE?) is responded to in the same way as that statement
followed by a period.) Synonymic translations of words are made so
that the pattern becomes, for example:
( WHAT BE YOU JOB )
Some groups of words (i.e. idioms) are translated as a group
so that, for example, "for a living" becomes "for job". Certain other
juxtaposed words are contracted into a single word, e.g. "place of
birth" becomes "birthplace". This is done to deal with groups of
words which are represented as a single element in the stored
pattern, thereby preventing segmentation from occurring at the wrong
places, such as at a preposition inside an idiom or phrase. Besides
these contractions, certain expansions are made so that, for example,
"DON'T" becomes "DO NOT" and "I'D" becomes "I WOULD".
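The translations described in this section can be sketched as simple string rewrites; the phrase tables below are small invented samples, not the model's actual entries:

```python
# Invented sample tables; PARRY2's real lists were built
# empirically from thousands of teletyped interviews.
EXPANSIONS = {"don't": "do not", "i'd": "i would"}
CONTRACTIONS = {"for a living": "for job", "place of birth": "birthplace"}

def preprocess(text):
    text = text.lower()
    # Expand contracted forms first...
    for short, full in EXPANSIONS.items():
        text = text.replace(short, full)
    # ...then contract multi-word idioms into single pattern elements,
    # so segmentation cannot split inside them later.
    for phrase, unit in CONTRACTIONS.items():
        text = text.replace(phrase, unit)
    return text
```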
Misspellings can be the bane of teletyped interviews for an
algorithm. Here they are handled in two ways. First, common
misspellings of important words are simply put in the dictionary.
Thus "yoy" is known to mean "you". The apostrophe is often omitted
from contractions so most contractions are recognized with or without
it. These common misspellings were gathered from over 4000 interviews
with earlier versions of the paranoid model.
Second, there are 5 common forms of typing error which are
checked systematically. These are:
1) Doubled letter
2) Extraneous letter
3) Forgetting to hold the "shift key" for an apostrophe
4) Hitting a nearby key on the keyboard
5) Transposing two letters in a word
The first three errors can be corrected by deleting the offending
character from the word. This is accomplished by deleting each
character in turn until the word becomes a recognized word. The
fourth type of error is only checked for 10 of the more common near
misses. These were also empirically determined and include letter
pairs like (T Y), (O P), (A S), and (N M). These methods are all
based on typing errors, but they also correct some legitimate English
spelling errors. Two-letter transposition corrects, for example,
"beleive" to "believe".
SEGMENTING
Another weakness in the crude pattern matching of PARRY1 was
that it took the entire input expression as its basic processing
unit. Hence if only two words were recognized in an eight word
utterance, the risk of misunderstanding was great. We needed a way of
dealing with units shorter than the entire input expression.
Expert telegraphists stay six to twelve words behind a
received message before transcribing it. Translators wait until they
have heard four to six words before they begin translating. Aided by a
heuristic from work in machine-translation (Wilks, 1973), we devised
a way of bracketing the pattern constructed up to this point into
shorter segments using prepositions, wh-forms, certain verbs, etc. as
bracketing points. (A list of the bracketing terms appears in Fig.
3). These tend to separate prepositional phrases and embedded
clauses from the main clause. The new pattern formed is termed
either "simple", having no delimiters within it, or "compound", i.e.,
being made up of two or more simple patterns. A simple pattern might
be:
( WHAT BE YOU JOB )
whereas a compound pattern would be:
(( WHY BE YOU ) ( IN HOSPITAL ))
Our experience with this method of segmentation shows that compound
patterns from teletyped psychiatric dialogues rarely consist of more
than three or four segments.
After certain verbs (See Fig. 4) a bracketing occurs to
replace the commonly omitted "THAT", such that:
( I THINK YOU BE AFRAID )
becomes
(( I THINK ) ( YOU BE AFRAID ))
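This bracketing can be sketched as below; the verb list is an invented sample of the class shown in Fig. 4:

```python
BRACKET_AFTER = {"THINK", "BELIEVE", "FEEL"}   # invented sample of Fig. 4

def bracket_verbs(pattern):
    # Split the pattern after the first "think"-type verb, restoring
    # the bracketing an omitted "THAT" would have provided.
    for i, word in enumerate(pattern):
        if word in BRACKET_AFTER and i < len(pattern) - 1:
            return [pattern[:i + 1], pattern[i + 1:]]
    return [pattern]
```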
MATCHING INDIVIDUAL SEGMENTS
Conjunctions serve only as markers for the segmenter and they
are dropped out after segmentation.
Negations are handled by extracting the "NOT" from the
segment and assigning a value to a global variable which indicates
that the expression is negative in form. When a pattern is finally
matched, this variable is consulted. Some patterns have a pointer to
a pattern of opposite meaning if a "NOT" could reverse their
meanings. If this pointer is present and a "NOT" was found, then
the pattern matched is replaced by its opposite, e.g. ( I not trust
you ) is replaced by the pattern ( I mistrust you ). We have not yet
observed the troublesome case of "he gave me not one but two
messages". (There is no need to scratch where it doesn't itch).
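A sketch of the NOT-extraction just described, with the global negation flag simplified to a local check and an invented opposites table:

```python
# Invented opposites table; real patterns carry such a pointer only
# when a NOT could reverse their meaning.
OPPOSITES = {("I", "TRUST", "YOU"): ("I", "MISTRUST", "YOU")}

def handle_negation(segment):
    negative = "NOT" in segment          # the "global variable" here
    stripped = tuple(w for w in segment if w != "NOT")
    if negative and stripped in OPPOSITES:
        return OPPOSITES[stripped]       # swap in the opposite pattern
    return stripped
```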
Substitutions are also made in certain cases. Some segments
contain pronouns which could stand for a number of different things
of importance to PARRY2. As we mentioned in the introduction,
there is one case in which the response functions of memory aid in
language recognition. One function of memory is to keep track of
the context in order to give pronouns and other anaphoras a correct
interpretation. For example, the segment:
( DO YOU AVOID THEM )
could refer to the Mafia, or racetracks, or other patients, depending
on the context. When such a segment is encountered, the pronoun is
replaced by its current anaphoric value as determined by the response
functions, and a more specific segment such as:
( DO YOU AVOID MAFIA )
is looked up.
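The substitution can be sketched as below; in PARRY2 the current referent is supplied by the response functions, mocked here as a parameter:

```python
PRONOUNS = {"THEM", "THEY", "IT"}   # illustrative subset

def substitute_anaphora(segment, referent):
    # Replace each pronoun with the referent currently tracked by
    # the response functions (passed in here for simplicity).
    return tuple(referent if w in PRONOUNS else w for w in segment)
```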
There are other utterances, such as "Why did you do that?" or
just "Why?" (which might be regarded as a massive ellipsis), which
clearly refer back to previous utterances. These utterances match
very general patterns which identify the type of question without
indicating the exact topic. The response function which responds to
"Why?" consults the context to produce an appropriate answer.
The algorithm now attempts to match the segments with stored
simple patterns which currently number about 1700. First a complete
and perfect match is sought. When a match is found, the stored
pattern name has a pointer to the name of a response function in
memory which decides what to do further. If a match is not found,
further transformations of the segment are carried out and a "fuzzy"
match is tried.
For fuzzy matching at this stage, we adopted the heuristic
rule of dropping elements in the segment one at a time and attempting
a match each time. This heuristic allows ignoring familiar words in
unfamiliar contexts. For example, "well" is important in "Are you
well?" but meaningless in "Well are you?".
Deleting one element at a time results in, for example, the
pattern:
( WHAT BE YOU MAIN PROBLEM )
becoming successively:
(a) ( BE YOU MAIN PROBLEM )
(b) ( WHAT YOU MAIN PROBLEM )
(c) ( WHAT BE MAIN PROBLEM )
(d) ( WHAT BE YOU PROBLEM )
(e) ( WHAT BE YOU MAIN )
Since the stored pattern in this case matches (d), (e) would not be
constructed. We found it unwise to delete more than one element
since our segmentation method usually yields segments containing a
small (1-4) number of words.
Dropping an element at a time provides a probability
threshold for fuzzy matching which is a function of the length of
the segment. If a segment consists of five elements, four of the five
must be present in a particular order (with the fifth element missing
in any position) for a match to occur. If a segment contains four
elements, three must match - and so forth.
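The one-element-deletion heuristic can be sketched as follows, with a toy stand-in for the roughly 1700 stored simple patterns:

```python
SIMPLE_PATTERNS = {("WHAT", "BE", "YOU", "PROBLEM"): "problem-question"}

def fuzzy_match(segment):
    if segment in SIMPLE_PATTERNS:              # perfect match first
        return SIMPLE_PATTERNS[segment]
    for i in range(len(segment)):               # then drop one element
        candidate = segment[:i] + segment[i + 1:]
        if candidate in SIMPLE_PATTERNS:
            return SIMPLE_PATTERNS[candidate]
    return None
```

With this toy table the segment ( WHAT BE YOU MAIN PROBLEM ) matches on the fourth deletion, corresponding to case (d) above.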
COMPOUND-PATTERN MATCH
When more than one simple pattern is detected in the input, a
second matching is attempted against about 500 compound patterns.
Certain patterns, such as ( HELLO ) and ( I THINK ), are dropped
because they are considered meaningless. If a complete match is not
found, then simple patterns are dropped, one at a time, from the
compound pattern. This allows the input,
(( HOW DO YOU COME ) ( TO BE ) ( IN HOSPITAL ))
to match the stored pattern,
(( HOW DO YOU COME ) ( IN HOSPITAL )).
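This dropping of simple patterns can be sketched as below; the pattern names are invented placeholders:

```python
COMPOUND_PATTERNS = {("how-come", "in-hospital"): "why-hospitalized"}
MEANINGLESS = {"hello", "i-think"}   # patterns discarded before matching

def match_compound(names):
    names = tuple(n for n in names if n not in MEANINGLESS)
    if names in COMPOUND_PATTERNS:
        return COMPOUND_PATTERNS[names]
    for i in range(len(names)):      # drop one simple pattern at a time
        candidate = names[:i] + names[i + 1:]
        if candidate in COMPOUND_PATTERNS:
            return COMPOUND_PATTERNS[candidate]
    return None   # default condition: response functions take over
```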
If no match can be found at this point, the algorithm has
arrived at a default condition and the appropriate response functions
decide what to do. For example, in a default condition, the model
may assume control of the interview, asking the interviewer a
question, continuing with the topic under discussion or introducing a
new topic.
An annotated example of a diagnostic psychiatric interview is
presented in the appendix.
ADVANTAGES AND LIMITATIONS
As mentioned, one of the main advantages of a
pattern-matching strategy is that it can ignore as irrelevant both
some of what it recognizes and what it does not recognize at all.
There are several million words in English, each possessing one to
over a hundred senses. To construct a machine-usable word
dictionary of this magnitude is out of the question at this time.
Recognition of natural language input such as that described above allows
real-time interaction in a dialogue since it avoids becoming
ensnarled in combinatorial disambiguations and long chains of
inferencing which would slow a dialogue algorithm down to
impracticality, if it could even function at all. The price paid for
pattern-matching is that sometimes, but rarely, ambiguities slip
through. Eventually a rewrite interpreter must be added to pattern-
matching rules to deal with ambiguities (Enea and Colby, 1973).
A drawback to PARRY1 was that it reacted to the first pattern
it found in the input rather than characterizing the input as fully
as possible and then deciding what to do based on a number of tests.
Another practical difficulty with PARRY1 from a programmer's
viewpoint, was that elements of the patterns were strung out in
various procedures throughout the algorithm. It was often a
considerable chore for the programmer to determine whether a given
pattern was present and precisely where it was. In the
above-described method, the patterns are all collected in one part of
the data-base where they can easily be examined.
Concentrating all the patterns in the data base gives PARRY2
a limited "learning" ability. When an input fails to match any
stored pattern or matches an incorrect one, as judged by a human
operator, a pattern which matches the input can be put into the
data-base automatically. If the new pattern has the same meaning as
a previously stored pattern, the human operator must provide the name
of the appropriate response function. If he doesn't remember the
name, he may try to rephrase the input in a form recognizable to
PARRY2 and it will name the response function associated with the
rephrasing. These mechanisms are not "learning" in the commonly used
sense but they do allow a person to transfer his knowledge into
PARRY2's data-base with very little effort.
Informal observation thus far shows PARRY2's linguistic
recognition abilities to be quite superior to PARRY1's. A more
systematic and quantitative evaluation of performance will be carried
out in the near future.
REFERENCES
Colby, K.M., Weber, S., and Hilf, F.D. (1971). Artificial Paranoia.
ARTIFICIAL INTELLIGENCE, 2, 1-25.
Enea, H. and Colby, K.M. (1973). Idiolectic Language-analysis
For Understanding Doctor-patient Dialogues. THIRD
INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE,
278-284.
Wilks, Y. (1973). An artificial intelligence approach to machine
translation. In COMPUTER MODELS OF THOUGHT AND
LANGUAGE, Schank, R. and Colby, K.M., Eds., W.H. Freeman,
San Francisco.